We are largely cannibalizing 1_5_sapflow_workup.Rmd, but since we want to do our stats grouped by species, we’ll need to incorporate that info at the start of the workflow, then carry it through, which is why we’re doing a full, separate workflow.
The goal of this workup is to see if sapflow changed after flooding during the TEMPEST2 flooding events in 2023. To do this, I’m taking data from before, during, and after the flooding events, converting them to Fd, then normalizing them following McDowell et al. 2006 Figure 3.
First, here are the raw data we will use. A couple of days are missing because they were scrubbed due to non-stationary voltage:
Once converted to Fd following formulas from Steph/Nate, the data look like this:
Next, let’s color-code our dataset by time period relative to the flooding events to visualize the portions of the dataset we will be comparing:
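As a minimal sketch of the labeling behind that color-coding, each date can be assigned to a period relative to the flood window. The dates below are placeholders rather than the actual event dates, and `label_period()` and the level names are hypothetical names used only for illustration:

```r
# Placeholder flood window: swap in the actual event dates.
flood_start <- as.Date("2023-06-06")
flood_end   <- as.Date("2023-06-07")

# Label each date as before, during, or after the flood window.
label_period <- function(date, start = flood_start, end = flood_end) {
  period <- ifelse(date < start, "pre",
                   ifelse(date > end, "post", "flood"))
  factor(period, levels = c("pre", "flood", "post"))
}

dates <- as.Date(c("2023-06-01", "2023-06-06", "2023-06-07", "2023-06-10"))
label_period(dates)
```

Storing the result as an ordered set of factor levels keeps the periods in chronological order in plot legends.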
Let’s look at what time of day Fd is at its maximum:
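One way to pull out the time of the daily maximum, sketched in base R on made-up data. The column names (`sensor`, `datetime`, `Fd`) are assumptions, not necessarily those used in the real dataset:

```r
# Toy data: one sensor, two days, three readings per day (values are made up).
sapflow <- data.frame(
  sensor = "S1",
  datetime = as.POSIXct(c("2023-06-01 09:00", "2023-06-01 12:00", "2023-06-01 16:00",
                          "2023-06-02 08:00", "2023-06-02 13:00", "2023-06-02 17:00"),
                        tz = "UTC"),
  Fd = c(1.0, 3.0, 2.0, 0.5, 2.5, 2.8)
)
sapflow$date <- as.Date(sapflow$datetime)
sapflow$hour <- as.integer(format(sapflow$datetime, "%H"))

# For each sensor-day, keep the row where Fd is largest.
groups <- split(sapflow, list(sapflow$sensor, sapflow$date), drop = TRUE)
daily_max <- do.call(rbind, lapply(groups, function(g) g[which.max(g$Fd), ]))
rownames(daily_max) <- NULL
daily_max[, c("sensor", "date", "hour", "Fd")]
```

A histogram of `daily_max$hour` would then show how tightly the maxima cluster around noon.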
We know that maxima are generally around noon, but they range from before 8 am to after 4 pm in some cases. Since Miznar recommended picking one window for all days/sites/sensors, let’s see how much we lose if we ignore day-sensor combos with maxima outside of 9 am to 3 pm:
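A quick sketch of that check, using a made-up vector of maximum hours (one value per day-sensor combo). Whether the window edges are inclusive is my assumption here:

```r
# Hour of the daily Fd maximum for each day-sensor combo (made-up values).
max_hour <- c(12, 13, 7, 11, 16, 14, 12, 10)

# Assumption: the window is inclusive, i.e. 9 <= hour <= 15.
outside  <- max_hour < 9 | max_hour > 15
n_lost   <- sum(outside)         # number of combos dropped
pct_lost <- 100 * mean(outside)  # percent of combos dropped
```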
I tried a number of windows, the base being 1100-1300, which seems to be standard. I then widened it to +/- 2 hrs, 2.5 hrs, and 3 hrs around noon. Since some maxima fall outside of +/- 3 hrs, I compared the full-day maximum Fd to the maximum within each window to find a balance between following protocol (pick one window for calculating sapflow) and accounting for the high variance in the dataset. For each window, I subtracted the windowed max from the daily max to see how much difference the window size made. I then calculated p-values for each window relative to the largest window (+/- 3 hours) to determine whether any window shorter than 3 hours could be used without significantly increasing the difference between the actual max and the windowed max. Our Goldilocks here appears to be the 2.5-hour window, which was not significantly different from the 3-hour window for any plot, so we’ll use that going forward.
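The window comparison can be sketched roughly as follows. The data are synthetic, `window_loss()` is a hypothetical helper, and the paired t-test is my assumption since the text above does not name the test. This sketch also pools all sensor-days instead of testing plot by plot:

```r
# Synthetic data: 8 sensor-days, readings every half hour, each with a known peak hour.
hours <- seq(6, 18, by = 0.5)
peaks <- c(12, 10, 14.5, 9, 15.5, 8, 12.5, 11)
sapflow <- do.call(rbind, lapply(seq_along(peaks), function(i) {
  data.frame(day = i, hour = hours, Fd = exp(-(hours - peaks[i])^2 / 8))
}))

# Full-day max minus the max inside noon +/- half_width hours, per sensor-day.
window_loss <- function(df, half_width) {
  sapply(split(df, df$day), function(d) {
    in_window <- abs(d$hour - 12) <= half_width
    max(d$Fd) - max(d$Fd[in_window])
  })
}

half_widths <- c(1, 2, 2.5, 3)
loss <- sapply(half_widths, function(w) window_loss(sapflow, w))
colnames(loss) <- paste0("pm_", half_widths, "h")

# Assumed test: paired t-test of each narrower window against the +/- 3 h window.
p_values <- sapply(colnames(loss)[1:3], function(w) {
  t.test(loss[, w], loss[, "pm_3h"], paired = TRUE)$p.value
})
```

Each column of `loss` holds how much of the true daily maximum a given window misses, so a wider window can only shrink those values or leave them unchanged.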
There seem to be many ways to process these data. Since we are really interested in understanding whether the treatments changed relative to the control after flooding, we’ll follow the methods in McDowell et al. 2006 Figure 3. The general steps are:

1. normalize each plot to the mean value prior to disturbance, and then
2. normalize treatment plots by date to the Control.
Since our data have different dimensions (i.e., not just one control site), we will do some binning first to simplify things. First, let’s ID the period of maximum sapflow. Usually this is around noon, but we’d like to ID when the maximum is across the dataset. I want to make sure this makes sense, so let’s look at this by plot and by species:
That looks crazy… but let’s power through.
The next step is the normalization step #1 above: normalize each plot to the mean of the pre-flood values, basically setting a baseline of 0 for each plot:
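Here is a minimal sketch of that first normalization on made-up daily values. The plot names other than Control, the column names, and the numbers are all placeholders. I subtract the pre-flood mean, which is what gives a baseline of 0:

```r
# Toy daily values per plot (numbers are made up).
daily <- data.frame(
  plot   = rep(c("Control", "Treatment_A", "Treatment_B"), each = 4),
  date   = rep(as.Date("2023-06-01") + 0:3, times = 3),
  period = rep(c("pre", "pre", "post", "post"), times = 3),
  Fd     = c(10, 12, 11, 13,  8, 10, 6, 7,  20, 22, 15, 16)
)

# Mean pre-flood Fd for each plot.
pre <- daily[daily$period == "pre", ]
baseline <- tapply(pre$Fd, pre$plot, mean)

# Subtract each plot's own pre-flood mean so its baseline sits at 0.
daily$Fd_n <- daily$Fd - as.numeric(baseline[as.character(daily$plot)])
```

To carry species through, the same lookup could be keyed on a plot-by-species grouping instead of plot alone.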
Now for the second normalization: we’ll subtract the value of Control for each day from the value for each Treatment plot:
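A sketch of the second normalization, again on made-up numbers with placeholder plot and column names: join the Control value onto each Treatment row by date, then subtract.

```r
# Baseline-normalized daily values (made-up numbers).
norm1 <- data.frame(
  plot = rep(c("Control", "Treatment_A", "Treatment_B"), each = 3),
  date = rep(as.Date("2023-06-05") + 0:2, times = 3),
  Fd_n = c(0, 1, 2,  0, -2, -1,  1, -4, -3)
)

# Pull out the Control series and rename its value column.
control <- norm1[norm1$plot == "Control", c("date", "Fd_n")]
names(control)[2] <- "Fd_control"

# Match each Treatment row to the Control value for the same date, then subtract.
treatments <- merge(norm1[norm1$plot != "Control", ], control, by = "date")
treatments$Fd_nn <- treatments$Fd_n - treatments$Fd_control
treatments <- treatments[order(treatments$plot, treatments$date), ]
```

After this step, a value of 0 means the Treatment plot moved exactly as the Control did on that date.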
Let’s summarize this a little differently:
First, reading in PAR data for use later:
We’ll put PAR aside for now, but will see if there are correlations later to explain any potential variability in sapflow.